It is shown that the fiducial distribution in a group model, or
more generally a quasigroup model, determines the optimal equivariant
frequentist inference procedures. The proof does not rely on
existence of invariant measures, and generalizes results corresponding
to the choice of the right Haar measure as a Bayesian prior. Classical
and more recent examples show that fiducial arguments can be
used to give good candidates for exact or approximate confidence
distributions. It is here suggested that the fiducial algorithm can be
considered as an alternative to the Bayesian algorithm for the construction
of good frequentist inference procedures more generally.

1. Introduction. Fiducial theory was introduced by Fisher (1930) to avoid the
problems related to the choice of a prior distribution. Fiducial inference has not gained
much popularity as such, but the related theory has been historically influential (Efron,
1998), and is still important in the current flow of statistical developments (Efron, 2006;
Lidong et al., 2008; Ghosh et al., 2010; Fraser et al., 2010; Wang et al., 2012). Fisher's
own view on fiducial inference evolved over the years as can be inferred from a reading of
his initial (Fisher, 1930, 1935) and more final formulation of the theory (Fisher, 1973).
He was in particular very positive to the developments by Fraser (1961a, b, 1962, 1964),
and we most certainly share this point of view. Fraser (1968, 1979) develops the theory
and uses the label structural inference for this. A strongly related theory was presented
under the label of functional models by Bunke (1975) and Dawid and Stone (1982). The
term fiducial will here be used more generally so that it includes structural, functional,
and the original fiducial arguments given by Fisher.
The original idea of Fisher was to obtain the fiducial distribution directly from the
cumulative distribution, but this line of argument runs into problems when similar
arguments are tried in the multivariate case. The definition used here is based on the
solution of a fiducial equation, and is in this sense similar to the approaches of Fraser
(1968), Dawid and Stone (1982), and Hannig (2009, 2012). A more precise definition of
the term fiducial model as used here is given in Section 2 in Definition 1. A brief review
of alternative, but strongly related definitions found in the literature is given in the final
Section 4.

Let $l = \gamma(\theta, a)$ denote the realized loss for an action $a \in \Omega_A$ given the model parameter
$\theta \in \Omega_\Theta$. Let $\Omega_X$ be the sample space equipped with the $\sigma$-field $\mathcal{E}_X$ of events. The risk $\rho$ of
a decision rule $\delta : \Omega_X \to \Omega_A$ is by definition equal to the expected value $\rho = E^\theta \gamma(\theta, \delta(X))$.
This is determined by the statistical model given by the family $\{(\Omega_X, \mathcal{E}_X, P_X^\theta) \mid \theta \in \Omega_\Theta\}$
of probability spaces.

Consider now the more special case where $\Omega_X = \Omega_\Theta = \Omega_A = G$, possibly after a
suitable change of variables. Assume that $G$ is a measurable quasigroup with a unit $e$,
and product $(g_1, g_2) \mapsto g_1 g_2$ written like ordinary multiplication (Smith, 2006). This
includes the more common case of a group, but it is more general since the associative
law is not assumed to hold. Assume furthermore that $X \sim \theta U$ conditionally on $\Theta = \theta$
and that the law of $(U \mid \Theta = \theta)$ does not depend on $\theta$.

This gives an example of a fiducial model for the statistical model as defined more
generally in Definition 1 in Section 2. The fiducial distribution is obtained by solving the fiducial equation
$x = \theta u$ for $\theta$ when $u$ is sampled from $P_U^\theta$. Existence and uniqueness are ensured since $G$ is a
quasigroup. A variable $\Theta^x$ is uniquely determined by $x = \Theta^x U$. The fiducial distribution
is then the conditional law of $\Theta^x$ given $\Theta = \theta$.

Assume that the loss is invariant in the sense that $\gamma(\theta, a) = \gamma(g\theta, ga)$, and that the
decision rule is equivariant in the sense that $\delta(gx) = g\delta(x)$. The assumptions ensure the
validity of the following calculation



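(1a) $\rho = E^\theta \gamma(\theta, \delta(X))$    Risk

(1b) $= E^\theta \gamma(\theta, \delta(\theta U))$    Fiducial model for $P_X^\theta$

(1c) $= E^\theta \gamma(\theta, \theta\delta(U))$    Equivariance of $\delta$

(1d) $= E^\theta \gamma(e, \delta(U))$    Invariance of $\gamma$

(1e) $= E^\theta \gamma(\Theta^x, \Theta^x \delta(U))$    Invariance of $\gamma$

(1f) $= E^\theta \gamma(\Theta^x, \delta(\Theta^x U))$    Equivariance of $\delta$

(1g) $= E^\theta \gamma(\Theta^x, \delta(x))$    Fiducial equation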



A variation of the above argument gives that $\Theta^x$ can be replaced by $xU^{-1}$ in the
conclusion. In the group case the law of $\Theta^x$ will coincide with the law of $xU^{-1}$, but in general
not since the defining equation $e = UU^{-1}$ of the right inverse does not provide the
solution of the fiducial equation. It follows from this that an optimal equivariant rule,
if it exists, is determined by the fiducial distribution of $\Theta^x$ or by the distribution of
$xU^{-1}$ from the right inverse. The first part of the claims in the abstract has hence been
established.
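
The calculation in equation (1) can be checked numerically in a simple case. The following Python sketch is an illustration only and is not part of the original argument. It assumes the multiplicative group of positive reals, a hypothetical gamma law for $U$, the invariant loss $\gamma(\theta, a) = (a/\theta - 1)^2$, and the equivariant rule $\delta(x) = cx$ with an arbitrary constant $c = 0.8$; all of these choices are assumptions made for the example.

```python
import random

def loss(theta, a):
    # Invariant under the scale group: loss(g*theta, g*a) == loss(theta, a) for g > 0.
    return (a / theta - 1.0) ** 2

def delta(x, c=0.8):
    # Equivariant rule: delta(g*x) == g*delta(x). The constant c is arbitrary.
    return c * x

def risk_direct(theta, us):
    # Equation (1a): average loss when X = theta * U.
    return sum(loss(theta, delta(theta * u)) for u in us) / len(us)

def risk_fiducial(x, us):
    # Equation (1g): Theta^x = x / u solves the fiducial equation x = theta * u.
    return sum(loss(x / u, delta(x)) for u in us) / len(us)

rng = random.Random(0)
# Hypothetical law of U: gamma with shape 5 and scale 0.2.
us = [rng.gammavariate(5.0, 0.2) for _ in range(20000)]

r_a = risk_direct(2.0, us)     # risk at theta = 2.0
r_b = risk_direct(7.5, us)     # risk at theta = 7.5
r_f = risk_fiducial(3.1, us)   # fiducial expression for the observation x = 3.1
print(r_a, r_b, r_f)
```

With the same draws of $U$, the three averages agree up to rounding, which reflects that the risk does not depend on $\theta$ and equals the fiducial expected loss for any fixed observation $x$.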

It is hoped that the reader can appreciate the simplicity and consequence of the 
calculation given in equation (1), but it could also be considered to be essentially Greek. 
The required theory of decisions and fiducial theory will be explained in some more detail 
in Section 2, and examples are presented in Section 3. The presentation is essentially as 
given in standard textbooks (Lehmann and Casella, 1998; Lehmann and Romano, 2005; 
Schervish, 1995; Berger, 1985; Stuart et al., 1999), but with the simplifications given by a 
fiducial model. The monographs by Eaton (1989) and Wijsman (1990) are recommended 
as excellent sources for theory and examples beyond the standard textbooks. 

The presentation in the following will be mostly restricted to the group case, but it 
will be more general than the previous in the sense that the assumption of equality of
the involved spaces will be abandoned. It will be more general than standard theory 
since, as above, the arguments will not depend on existence of invariant measures. 

2. Optimal inference. Consider the case where the loss of an action $a \in \Omega_A$ is
of the form $l = \gamma(\theta, a)$ corresponding to a statistical model $\{P_X^\theta \mid \theta \in \Omega_\Theta\}$. It is here
assumed that the model parameter $\Theta$ is a $\sigma$-finite random quantity and this and all other
random quantities are defined based on the underlying conditional probability space
$(\Omega, \mathcal{E}, P)$ as explained by Taraldsen and Lindqvist (2010). This means in particular that
$P_X^\theta(B) = P(X \in B \mid \Theta = \theta)$, and $X : \Omega \to \Omega_X$, $\Theta : \Omega \to \Omega_\Theta$ are measurable functions.
It means also that all expectations that occur are defined by integration over $\Omega$. As
an example $E(\phi(Z) \mid T = t) = \int \phi(Z(\omega)) P^t(d\omega)$ by definition. It is here assumed that
$\phi : \Omega_Z \to \mathbb{R}$, $Z : \Omega \to \Omega_Z$, and $T : \Omega \to \Omega_T$ are measurable. The conditional law $P^t$ is well
defined if $P_T$ is $\sigma$-finite. The consequence $E(\phi(Z) \mid T = t) = \int \phi(z) P_Z^t(dz)$ is a theorem.

The law $P_\Theta$ of $\Theta$ is not assumed known and is not needed in the arguments which
follow. The reason for the assumption of existence of $\Theta$, $X$, and indeed any random
quantity involved in the arguments, as functions defined on the conditional probability
space $(\Omega, \mathcal{E}, P)$ is as in the formulation of probability theory given by Kolmogorov (1933):
Any collection of random quantities gives a new random quantity with a well defined
law, and measurable functions of random quantities give new random quantities. The
theory is completely based on the underlying abstract space $\Omega$.

A group invariant problem is given by a group $G$ that has a transformation group
action on the sample space $\Omega_X$, the model parameter space $\Omega_\Theta$, and the action space $\Omega_A$.
The problem is group invariant if $P_{gX}^\theta = P_X^{g\theta}$ and $\gamma(g\theta, ga) = \gamma(\theta, a)$. An inference rule $\delta$
with a corresponding action $A = \delta(X)$ is equivariant if $\delta(gx) = g\delta(x)$. The restriction to
the class of equivariant actions can be interpreted as a consistency requirement: An
observation $x$ from $P_X^\theta$ corresponds to an observation $gx$ from $P_X^{g\theta}$. The two corresponding
problems are formally identical and the use of an equivariant action ensures consistency.

The problem considered here is to determine an equivariant $\delta$ such that the risk

(2) $\rho = E^\theta \gamma(\theta, \delta(X))$

is minimized. It will be assumed that $G = \Omega_\Theta$ with the action given by the group
multiplication $g\theta$ directly. The orbit of $x$ in $\Omega_X$ is defined by $Gx = \{gx \mid g \in G\}$, and
likewise for orbits in $\Omega_\Theta$ and $\Omega_A$. The action of $G$ is free on $\Omega_X$ if the mapping $g \mapsto gx$ is
injective for all $x$. The group action is transitive on $\Omega_X$ if $Gx = \Omega_X$. If the group action
is both transitive and free, then it is said to be regular and the corresponding space
is then a principal homogeneous space for $G$. It follows in particular that the model
parameter space $\Omega_\Theta$ is a principal homogeneous space for $G$, but there has also been an
identification of the identity element $e$ in $\Omega_\Theta$.

Let $U$ be a random quantity such that $P(U \in A \mid \Theta = \theta) = P(X \in A \mid \Theta = e)$ holds
identically for all $A$ and $\theta$. It follows then that

(3) $(X \mid \Theta = \theta) \sim (\theta U \mid \Theta = \theta)$

since the group invariance of the statistical model justifies $P_X^\theta = P_{\theta X}^e = P_{\theta U}^\theta$. This
construction proves that $(U, \chi)$ with

(4) $\chi(u, \theta) = \theta u$

is a fiducial model for $P_X^\theta$. The concept of a fiducial model is defined as follows.

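Definition 1 (Fiducial model). Let $\Theta$ be a $\sigma$-finite random quantity. A fiducial
model $(U, \zeta)$ is given by a random quantity $U$ and a measurable function $\zeta : \Omega_U \times
\Omega_\Theta \to \Omega_Z$. This is a fiducial model for the statistical model $\{P_Z^\theta \mid \theta \in \Omega_\Theta\}$ if

(5) $(\zeta(U, \Theta) \mid \Theta = \theta) \sim (Z \mid \Theta = \theta)$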

The notation $(W_1 \mid \Theta = \theta) \sim (W_2 \mid \Theta = \theta)$ means that $P_{W_1}^\theta = P_{W_2}^\theta$ so Definition 1
can be compared with equation (3). It is allowed in the above that $P_U^\theta$ does depend on
$\theta$. Interesting examples where this occurs are discussed by Fraser (1979) in the form of
dependence on shape parameters in addition to pure group parameters. In the following
it will, however, be assumed throughout that the fiducial model is conventional in the
sense that $P_U^\theta$ does not depend on $\theta$.

It is important to notice that many different fiducial models are possible for a given 
statistical model. A fiducial model provides a different basis for statistical inference than 
a statistical model. The choice of a particular fiducial model can be compared with the 
choice of a Bayesian prior together with a statistical model. Fiducial inference is then 
initially different from frequentist and Bayesian inference since the inferential basis is 
given by a fiducial model which is assumed known. Fiducial inference as such will not 
be considered here, but the corresponding fiducial algorithms will be used as vehicles for 
the construction of frequentist procedures. 

A fiducial model $(U, \zeta)$ is simple if the fiducial equation $\zeta(u, \theta) = z$ has a unique
solution $\theta^z(u)$ when solved for $\theta$ for all $u, z$. In the simple and conventional case the
fiducial distribution is defined as the distribution of $\Theta^z = \theta^z(U)$ conditional on $\Theta = \theta$.

Definition 2 (Fiducial distribution in the simple and conventional case). Let $(U, \zeta)$
be a conventional simple fiducial model. Define the random quantity $\Theta^z$ by $z = \zeta(U, \Theta^z)$.
The fiducial distribution is the conditional law of $\Theta^z$ given $\Theta = \theta$.

The fiducial model $(U, \chi)$ given by equation (4) is simple if and only if $\Omega_X$ is a
principal homogeneous space for $G$. In this case, by the choice of a unit element in $\Omega_X$,
the identification $G = \Omega_\Theta = \Omega_X$ can be done. It follows then that $\theta^x(u) = xu^{-1}$ is the
unique solution of $x = \theta u$, and the fiducial distribution is the conditional distribution of
$\Theta^x = xU^{-1}$ as it appears in equation (1g).

The remainder of this section will be on the analysis of the group model by means of the
constructed fiducial model given by equation (4) in the case where $\Omega_X$ is not assumed
to be a principal homogeneous space for $G$. The aim is to determine an equivariant
inference rule $\delta$ so that the risk given by equation (2) is minimized. A definition of a
fiducial distribution will also be presented for this group case. The resulting distribution
coincides with the distribution described with many more examples, explicit calculation
of densities, and illustrative figures by Fraser (1968, 1979).

A first observation is that the calculations given by equations (1a)-(1d) are valid and
the risk is given by $\rho = E(\gamma(e, \delta(U)) \mid \Theta = \theta)$. The construction of the fiducial model
has hence given a simple proof that the risk does not depend on the model
parameter since $P_U^\theta$ does not depend on $\theta$.

Let $Y = \phi(X)$ be an invariant statistic in the sense that $\phi(\theta x) = \phi(x)$ for all $\theta, x$.
This is equivalent to the requirement that $\phi$ is constant on all orbits in the sample
space $\Omega_X$. The fiducial model in equation (4) gives that $Y = \phi(X) \sim \phi(\theta U) = \phi(U)$
conditionally on $\Theta = \theta$. The conclusion is that $P_Y^\theta$ does not depend on $\theta$ and has a
known distribution. This proves that an invariant statistic $Y$ is ancillary.

Assume furthermore that $Y = \phi(X)$ is a maximal invariant statistic. This means that
the family of level sets of $\phi$ coincides with the family of orbits in $\Omega_X$. Let $x$ be given and
assume that $y = \phi(x) = \phi(u)$. The maximality ensures that $x \in Gu = \Omega_\Theta u$, so $x = \theta^x u$
for some $\theta^x$. This $\theta^x$ will be unique if $G$ acts freely on $\Omega_X$, but here it will more generally
be assumed that $\theta^x$ is determined by the choice of a measurable selection. The measurable
selection theorem (Castaing and Valadier, 1977) ensures existence under mild conditions.
The fiducial distribution of the corresponding variable $\Theta^x$ can be described as follows.

Definition 3 (Fiducial distribution in the group case). Let $u$ be a sample from the
distribution of $(U \mid \Theta = \theta, \phi(U) = \phi(x))$ where $\phi$ is a maximal invariant. Let $\theta^x$ be a
measurable selection solution of $x = \theta^x u$. This $\theta^x$ is a sample from a fiducial distribution.

The solution $\theta^x$ exists since $y = \phi(x) = \phi(u)$ ensures that $x$ and $u$ are on the same
orbit. Definition 3 is a special case of Definition 2 if $\Omega_X$ is a principal homogeneous
space for $G$, and the definitions are hence consistent. It is possible to define a fiducial
distribution for more general cases. One version is as presented by Hannig (2009), but
there are also other possibilities available. This will not be discussed further here since
the given definitions of the fiducial distribution are sufficient for the purposes in this
paper.

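Let $Y = \phi(X)$ be a maximal invariant statistic. The calculation that gave
equation (1d) can now be continued to give

(6) $\rho = E^\theta \gamma(e, \delta(U)) = \int \left[ E^{\theta,y} \gamma(e, \delta(U)) \right] P_{\phi(U)}^\theta(dy)$

where $E^{\theta,y}$ denotes the expectation conditional on $\Theta = \theta$ and $\phi(U) = y$.
The expression $[\cdot]$ does only depend on $y$. The optimal rule $\delta$, if it exists, is found by
minimization for each given $y$. Assume that $x$ is such that $y = \phi(x)$. It follows then that

(7) $E^{\theta,y} \gamma(e, \delta(U)) = E^{\theta,y} \gamma(\Theta^x, \Theta^x \delta(U)) = E^{\theta,y} \gamma(\Theta^x, \delta(x))$

and the optimal rule $\delta$ is determined by the fiducial distribution of $\Theta^x$. The variable $\Theta^x$
is defined as a measurable selection solution of $x = \Theta^x U$. This result can be summarized
as the main technical result in this paper.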
Theorem 1. The risk of an equivariant rule in a group invariant problem is determined
by a fiducial distribution if the model parameter space is a principal homogeneous
space for the group.

It should be noted that the statement assumes existence of a fiducial distribution 
as described above, but uniqueness of a fiducial distribution is not assumed. Optimal 
inference procedures are determined by the fiducial distribution regardless of the choice 
of a measurable selection for the determination of a fiducial distribution. The optimal $\delta$
is found, if it exists, as the minimizer $\delta(x)$ of the expression

(8) $E^{\theta,y} \gamma(\Theta^x, \delta(x))$

where the conditional distribution of $\Theta^x$ is a fiducial distribution.

Theorem 1 generalizes directly to the larger class of randomized equivariant actions.
This is obtained by a replacement of the equivariant action $\delta(X)$ by the randomized
equivariant action $\delta(X, V) = \delta(\theta U, V)$ in the calculations. It is here assumed that $U$ and
$V$ are conditionally independent in the sense that $P_{U,V}^\theta(du, dv) = P_U^\theta(du) P_V^\theta(dv)$, and
both conditional distributions do not depend on $\theta$. The equivariance is defined by the
identity $\delta(gX, V) = g\delta(X, V)$.

A randomized action corresponds to the assignment of a probability measure on the
action space $\Omega_A$. The set of randomized actions is hence always a convex set, and this
gives theoretical advantages to the problem of minimization of the risk. If, however, the
loss function $\gamma(\theta, a)$ is convex in $a$ on $\Omega_A$ for each $\theta$, then the Jensen inequality gives that it
is sufficient to consider non-random actions (Lehmann and Casella, 1998, p. 48).

Theorem 1 generalizes also directly to the case where $G$ is only assumed to act
transitively on $\Omega_\Theta$. The construction is as above, and starts with fixing a $\theta_0$ and the
construction of a random variable $U$ such that $P_U^\theta = P_X^{\theta_0}$. All the arguments given above
can then be repeated with $G$ playing the role of a new and possibly larger parameter
space. The result is then first a fiducial distribution on $G$, but this is pushed forward to
a fiducial distribution on $\Omega_\Theta$ by the mapping $g \mapsto g\theta_0$.

It is known that the fiducial coincides with the posterior from a right Haar prior, and 
for these cases Theorem 1 is a known result with the posterior used in the formulation in- 
stead. There are, however, groups where no Haar prior exists, and in this case Theorem 1 
and its extensions given by the above comments is a novel result. The derivation given 
in the introduction also gives a similar result in the more general case of a quasigroup,
and the existence of invariant measures is then also not automatic. 

3. Examples. The examples presented next are chosen to illustrate the concepts. 
Many more examples and thorough discussions are found in the previously quoted text- 
books and monographs. A complete treatment of the given examples - including in 
particular simulation studies of the resulting procedures - will not be pursued since this 
would tend to take attention away from the main issue. The purpose is simply to indicate 
the usefulness of fiducial theory. 
3.1. The Bernoulli distribution. A random sample of size n from the Bernoulli
distribution provides an example where the results related to Theorem 1 cannot be applied
directly. Fiducial theory can, nonetheless, be used to obtain optimal inference.

The largest possible group $G$ equals $\{e, g_1\}$ corresponding to the group of permutations
of the set $\{0, 1\}$. The action on $\Omega_\Theta = (0, 1)$ is determined by $g_1 p = 1 - p$, and the set
of orbits in the parameter space is uncountable. The conclusion is that conditioning on
the maximal invariant as in the arguments leading to Theorem 1 does not provide any
essential simplification of the problem.

This example is, however, very important from the point of view that fiducial distri- 
butions can still provide optimal procedures. Blank (1956) has constructed a randomized 
most powerful unbiased confidence interval, and this is related to a fiducial distribution 
(Anscombe, 1948; Stevens, 1950, 1957). 

The empirical mean is the unbiased estimator of $p$ with minimum variance. It can,
however, be argued that neither unbiasedness nor minimum variance are natural concepts
in this particular case. The parameter space $\Omega_\Theta$ can alternatively be identified with the
circular arc $\{(\sqrt{p}, \sqrt{q}) \mid p, q > 0, p + q = 1\}$ in the $(\sqrt{p}, \sqrt{q})$-plane. This has the advantage
that the Fisher information metric distance between two distributions in this parametric
family equals the distance along the arc (Rao, 1945; Atkinson and Mitchell, 1981; Amari,
1990). The distance squared provides a loss that is invariant with respect to $G$. A natural
task is to investigate the existence of an optimal equivariant estimator of $p$ with respect
to the distance squared on the arc. A reasonable candidate arises from the previously
referenced fiducial distribution, but this will not be discussed further here.

3.2. The Octonions. The purpose here is to give an example which does not involve
a group and where the argument given in equation (1) provides a fiducial distribution
that can be used to determine whether an optimal decision rule exists. The
octonions are used here since they are one of the more interesting examples
of a group-like structure where the associative law fails. They also have a natural invariant
loss that can be used in the arguments that follow. A more familiar example without
associativity can be constructed for the original model of Fisher for the correlation
coefficient, but we have not been able to identify a natural invariant loss in that case.

The Cayley-Dickson construction defines a multiplication $(a, b)(c, d) = (ac - d^*b, da +
bc^*)$ and an involution $(a, b)^* = (a^*, -b)$ on $A \times A$ where $A$ is an algebra with an
involution. Starting with the real numbers $\mathbb{R}$ this gives the complex numbers $\mathbb{C}$. Repeated
application of the construction gives then next the quaternions $\mathbb{H}$ and then next the
octonions $\mathbb{O}$. The octonions are hence equal to the 8-dimensional vector space $\mathbb{R}^8$ equipped
with a particular multiplication operation so that $\mathbb{O}$ is an algebra (Baez, 2002).

The number 1 is the unit for multiplication, and every nonzero element $x$ has a
multiplicative inverse with $1 = xx^{-1} = x^{-1}x$. The usual norm on $\mathbb{R}^8$ is also given by
the product and involution $*$ through $\|x\|^2 = xx^*$, and the identity $\|xy\| = \|x\| \|y\|$ holds.
It follows in particular that $x^{-1} = x^* / \|x\|^2$. The multiplication is not associative, but the
algebra $\mathbb{O}$ is alternative: The subalgebra generated by any two elements is associative.
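
The stated properties can be checked numerically. The following Python sketch is an illustration only: it applies the multiplication and involution formulas above recursively to lists of length 1, 2, 4, and 8, and the helper names (`conj`, `mul`, `norm2`, `inv`, `unit`) are hypothetical choices made for this example.

```python
import random

def conj(x):
    # Involution (a, b)* = (a*, -b); on the reals it is the identity.
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-t for t in x[h:]]

def add(x, y):
    return [s + t for s, t in zip(x, y)]

def sub(x, y):
    return [s - t for s, t in zip(x, y)]

def mul(x, y):
    # Product (a, b)(c, d) = (ac - d*b, da + bc*), applied recursively.
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    return sub(mul(a, c), mul(conj(d), b)) + add(mul(d, a), mul(b, conj(c)))

def norm2(x):
    # Squared Euclidean norm.
    return sum(t * t for t in x)

def inv(x):
    # x^{-1} = x* / ||x||^2 for nonzero x.
    n2 = norm2(x)
    return [t / n2 for t in conj(x)]

def unit(i, dim=8):
    # Basis vector number i; unit(0) is the multiplicative unit 1.
    return [1.0 if j == i else 0.0 for j in range(dim)]

rng = random.Random(1)
x = [rng.uniform(-1, 1) for _ in range(8)]
y = [rng.uniform(-1, 1) for _ in range(8)]
e1, e2, e4 = unit(1), unit(2), unit(4)

print(mul(mul(e1, e2), e4))   # (e1 e2) e4
print(mul(e1, mul(e2, e4)))   # e1 (e2 e4)
print(norm2(mul(x, y)), norm2(x) * norm2(y))
```

The two printed triple products differ in sign, which shows that the associative law fails, while the last line compares $\|xy\|^2$ with $\|x\|^2 \|y\|^2$ for two random elements.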

Consider next a fiducial model $(U, \chi)$ where $x = \chi(u, \theta) = \theta u$ is given by the product
in $\mathbb{O}$, and where the conditional law $P_U^\theta$ is specified and does not depend on $\theta$. Assume
that $\Omega_X = \Omega_U = \Omega_\Theta = \Omega_A = G$ where $G$ is a subset of $\mathbb{O}$ that contains 1, the product
of any two elements, and the inverse of any element. The particular examples where $G$
is the nonzero octonions or where $G$ is the octonions with unit norm provide examples
where $G$ is not a group since the associative law fails. This is then a fiducial model for
a statistical model $\{P_X^\theta \mid \theta \in \Omega_\Theta\}$ where $(X \mid \Theta = \theta) \sim (\theta U \mid \Theta = \theta)$. The corresponding
fiducial distribution is the conditional distribution of $\Theta^x = xU^{-1}$ given $\Theta = \theta$. Consider
the case where the loss of an action $a$ is given by $\gamma(\theta, a) = \|\theta - a\|^2 / \|\theta\|^2$. This loss is
invariant, so the calculation in equation (1) gives that the risk of an equivariant decision
rule is given by $E^\theta \gamma(\Theta^x, \delta(x))$.

Existence of an optimal estimator depends on $P_U^\theta$, or equivalently on the law of $\Theta^x$, and this will
not be discussed further here. It can, however, be noted that any optimal equivariant
decision rule is determined by $\delta(x) = x\delta(1)$, and $\delta(1)$ is the minimizer of $E^\theta \gamma(\Theta^1, \delta(1))$.
A rule of this form will be equivariant if $\delta(1)$ belongs to the set $\{a \in G \mid (g_1 g_2)a =
g_1(g_2 a) \ \forall g_1, g_2 \in G\}$.

There are many other examples of binary operations that are not associative. A generic
family of examples is produced by a relationship $x = \chi(u, \theta)$ that has the property (*):
It gives a one-one correspondence between the domains of any two of the variables when
the value of the third is fixed. Corresponding fiducial models based on $\chi$ define the class
of simple pivotal models in accordance with the terminology of Dawid and Stone (1982,
p. 1057). Concrete elementary examples are provided by $x = u - \theta$, $x = u\theta^{-1}$, and x =
on suitable domains.

The property (*) is conserved by a change of variables by one-one transformations
resulting in $\phi_x(x) = \chi(\phi_u(u), \phi_\theta(\theta))$. For the given elementary examples there exists a
change of variables so that the result is a relation $x = u\theta$ given by a group multiplication.
This is not possible in general. Simple counterexamples arise for the Fisherian simple
pivotal models determined by the relation $u = F(x \mid \theta)$ where $F$ is a suitable cumulative
distribution function. The prototypical example used by Fisher (1930, p. 534) when he
introduced the fiducial distribution is given by the sample correlation coefficient from a
bivariate normal distribution. In this case a reduction to a group model as for the given
elementary examples is not possible.

In the general case starting from the property (*) there exists, however, a change
of variables that results in a relation given by a quasigroup with a unit: A loop. The
important conclusion of this short discussion is that the theory of simple pivotal models
is linked naturally to the theory of loops. The nonzero octonions provide an example of
a loop which is not reducible to a group by a change of variables.

3.3. Hilbert space. One purpose of this example is to demonstrate existence of a case
where Theorem 1 can be used, but where an invariant measure does not exist.

Let $\Omega_\Theta = \Omega_A = G$ and $\Omega_X = \Omega_U = G^n$ where $G$ is a complex or real Hilbert space.
The Hilbert space $G$ is a group where the addition of vectors is the group operation, and
an invariant loss is given by the squared distance between vectors as $\gamma(\theta, a) = \|\theta - a\|^2$.
A conventional fiducial model $(U, \chi)$ is given by $x_i = \chi_i(u, \theta) = \theta + u_i$ for $i = 1, \ldots, n$
and a specification of a distribution $P_U^\theta$ that does not depend on $\theta$. A maximal invariant
is given by $y = (x_2 - x_1, \ldots, x_n - x_1)$. The fiducial distribution is given as the distribution
of $\Theta^x = x_1 - U_1$ from the conditional law $(U \mid \Theta = \theta, (U_2 - U_1, \ldots, U_n - U_1) = y)$. The
optimal estimator of $\theta$ is given as $\delta(x) = x_1 - E(U_1 \mid \Theta = \theta, (U_2 - U_1, \ldots, U_n - U_1) = y)$.

It will be demonstrated in the next subsection that it is not necessary to assume
independence of $\{U_i\}$ in the previous argument, and this assumption has indeed not
been mentioned above. More important is the fact that a right Haar prior does not exist
in the case where $G$ is an infinite dimensional Hilbert space. An explicit example is given
by $G = \ell^2(\mathbb{N}) = \{(a_i) \mid \|a\|^2 = \sum_i |a_i|^2 < \infty\}$.

The previous example has an infinite dimensional parameter space, and this feature 
is quite common in applications as exemplified by non-parametric statistics. The exam- 
ple does also include data that are infinite dimensional, and this can be considered to 
be unrealistic in applications. There are, however, applications where it is nonetheless 
convenient to assume that the observations are also infinite dimensional. An important 
source of examples is given by the statistical signal processing literature (Van Trees, 
2003). Explicitly, it can be convenient to assume that a signal is observed not only at 
discretely sampled points, but for all points. Similarly, it can be convenient to assume 
that a complete infinite sequence of sampled points is observed, even though only a finite 
number of samples are actually observed. In both cases this can lead to a sample space 
that is not finite dimensional. A related and very common convenience is to assume that 
observations are given by real numbers, even though the majority of concrete examples 
actually only involves a finite set of observable values due to limited instrument res- 
olution (Taraldsen, 2006). Explicit consideration of the limit from discretized data to 
continuous data gives, incidentally, a most promising route for the definition of fiducial 
distributions more generally than considered in Section 2 as demonstrated recently by 
Hannig (2012). 

If one observes the real random variables $X_1, \ldots, X_n$ independently normally
distributed with unknown mean $\theta = (\mu_1, \ldots, \mu_n)$ and variance 1, it is customary to estimate
$\mu_i$ by $X_i$. If the loss is the sum of squares of the errors, this estimator is admissible for
$n \le 2$, but inadmissible for $n \ge 3$ (Stein, 1956). The optimal estimator derived above
coincides with the customary estimator. This exemplifies that the optimal estimator can 
be inadmissible. The optimality is only ensured within the class of equivariant estima- 
tors. Equivariance can be a most natural demand, but this depends on the particular 
concrete modeling case at hand. In certain situations (Efron and Morris, 1977) it can be 
natural to give away the equivariance demand in order to obtain more precise estimates. 
In other cases, especially in the context of physics, the equivariance demand can be closer 
to the foundation of the subject matter and will be an absolute demand. 

3.4. Uniform distribution. A particular case of the previous Hilbert space example is
given by assuming $G = \mathbb{R}$ and where $P_U^\theta$ corresponds to a random sample of size $n$ from
the uniform distribution on $(0, 1)$. This gives then a fiducial model for a random sample
from the uniform distribution on $(\theta, \theta + 1)$. A fiducial distribution and a corresponding
optimal estimator of $\theta$ follow from the Hilbert space argument. An alternative and more
geometrically tractable argument follows as explained next from the use of the sufficient
statistic given by the maximum and minimum observation.

Let $X_i = \theta + U_i$ where the joint distribution of $(U_1, U_2)$ conditional on $\Theta = \theta$ is given
by the density $f(u \mid \theta) = n(n - 1)(u_2 - u_1)^{n-2}$ on $\{(u_1, u_2) \mid 0 < u_1 < u_2 < 1\}$. This is
then a fiducial group model for the sufficient statistic given by the smallest and largest
observation from a random sample from the uniform distribution on $(\theta, \theta + 1)$. The
model is also a special case of the Hilbert space example with $n = 2$ and where $\{U_i\}$
are conditionally dependent given $\Theta = \theta$. Reduction by sufficiency has here simplified
the problem, but the fiducial equation is still overdetermined so a further reduction by
the maximal invariant $y = x_2 - x_1$ is necessary. The resulting conditional distribution
$(U_1 \mid \Theta = \theta, U_2 - U_1 = y)$ becomes the uniform distribution on $(0, 1 - y)$, and the
fiducial distribution of $\Theta^x = x_1 - U_1$ becomes the uniform distribution on $(x_2 - 1, x_1)$. This is
also a confidence distribution for $\theta$. The optimal estimator for $\theta$ given the invariant loss
$|\theta - a|^2$ is $\delta(x) = (x_1 + x_2)/2 - 1/2$.
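
The derivation above can be illustrated by a small simulation. The following Python sketch is an illustration under the stated model only; the function names and the chosen numbers (the data `xs`, the sample size 4, the level 0.9) are hypothetical. It draws from the fiducial distribution by sampling $U_1$ uniformly on $(0, 1 - y)$ and solving $x_1 = \theta + u_1$, computes the estimator as the midpoint of the fiducial interval, and checks by repeated sampling that the fiducial distribution function evaluated at the true $\theta$ behaves as a confidence distribution.

```python
import random

def fiducial_interval(xs):
    # Support (x2 - 1, x1) of the fiducial distribution, with x1 = min and x2 = max.
    x1, x2 = min(xs), max(xs)
    return x2 - 1.0, x1

def optimal_estimate(xs):
    # Mean of the uniform fiducial distribution: (x1 + x2)/2 - 1/2.
    lo, hi = fiducial_interval(xs)
    return (lo + hi) / 2.0

def fiducial_sample(xs, rng):
    # Draw U1 given U2 - U1 = y, then solve the fiducial equation x1 = theta + u1.
    x1, x2 = min(xs), max(xs)
    y = x2 - x1
    u1 = rng.uniform(0.0, 1.0 - y)
    return x1 - u1

def coverage(theta, n, level, reps, rng):
    # Fraction of repetitions where the fiducial CDF at the true theta is <= level.
    hits = 0
    for _ in range(reps):
        xs = [theta + rng.random() for _ in range(n)]
        lo, hi = fiducial_interval(xs)
        hits += ((theta - lo) / (hi - lo) <= level)
    return hits / reps

rng = random.Random(3)
xs = [2.3, 2.9, 2.5]
lo, hi = fiducial_interval(xs)
est = optimal_estimate(xs)
draws = [fiducial_sample(xs, rng) for _ in range(1000)]
cov = coverage(5.0, 4, 0.9, 20000, rng)
print(lo, hi, est, cov)
```

The estimated coverage should be close to the nominal level, up to Monte Carlo error.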

We choose to add a few comments on this model and estimator since it has some 
unusual features. A first observation is that the Fisher information metric fails to exist 
due to nonexistence of the required derivative. The corresponding distance between 
two distributions can, however, still be defined through the length of the parametric 
curve $\theta \mapsto f(\cdot \mid \theta)$ in the Hilbert space of square integrable functions. This curve is
continuous, but the length from $\theta_1$ to $\theta_2$ is infinite: It is larger than $2\sqrt{n}\sqrt{|\theta_2 - \theta_1|}$ for
any integer $n$.

The squared distance $|\theta_1 - \theta_2|^2$ is the squared distance from the Fisher information
metric for any location family where the density is smooth. Based on this we consider the
invariant loss $|\theta - a|^2$ to be a natural choice also in the non-smooth example considered
here.

The optimal estimator $\delta$ found above is unbiased and has hence minimum variance
in the class of unbiased and equivariant estimators. Nonetheless, according to
Lehmann and Casella (1998, p. 87), there exists no uniformly minimum variance unbiased
estimator of $\theta$. The statistic $(X_1, X_2)$ is a minimal sufficient statistic, but it is not
complete. The estimator $\delta$ is, however, the uniformly minimum variance unbiased estimator
in the larger parametric family which also includes a scale parameter (Johnson et al.,
1994, vol. 2, p. 292). This latter reference is also a very good source for further references
and peculiarities regarding the uniform law.

3.5. Exponential. The following example is a scale example, and can be reduced to be 
a special case of the Hilbert space location problem by the logarithmic transformation. 
A direct solution is equally elementary and is presented to illustrate the derivation of an 
optimal estimator. The explicit formula for the estimator is possibly a novelty. 

Assume that $Y_1, \ldots, Y_n$ is a random sample of size $n$ from the exponential distribution
with scale parameter $\beta$. A fiducial model is given by $Y_i = \beta V_i$ where the law $P_V^\theta$ is as
for a random sample from the standard exponential distribution. The arithmetic mean
$X = \bar{Y}$ is a minimal sufficient statistic. A corresponding fiducial model is given by
$X = \beta \bar{V} = \beta U$ where $P_U^\theta$ is the law of a gamma variable with scale equal to $1/n$ and
shape equal to $n$. This follows from well known properties of the gamma distribution. The
model is both simple and conventional, and the fiducial distribution for an observation
$X = x$ is hence the conditional distribution of $\Theta^x = x/U$. The conclusion is that the
fiducial distribution is the inverse-gamma with scale $xn$ and shape $n$.
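
The fiducial distribution can be sampled directly from its definition. The sketch below draws $x/U$ with $U$ gamma distributed (shape $n$, scale $1/n$) and compares the Monte Carlo mean with the mean $xn/(n-1)$ of the inverse-gamma law. The function names and the numerical values of $x$ and $n$ are illustrative assumptions.

```python
import random


def fiducial_sample(x, n, size, seed=0):
    """Draw from Theta^x = x / U with U ~ Gamma(shape=n, scale=1/n)."""
    rng = random.Random(seed)
    # random.gammavariate(alpha, beta): alpha is the shape, beta is the scale.
    return [x / rng.gammavariate(n, 1.0 / n) for _ in range(size)]


def inverse_gamma_mean(shape, scale):
    """Mean of an inverse-gamma law; finite only for shape > 1."""
    return scale / (shape - 1)


x, n = 2.0, 10                      # illustrative observation and sample size
draws = fiducial_sample(x, n, 200_000)
mc_mean = sum(draws) / len(draws)   # Monte Carlo mean of the fiducial draws
exact_mean = inverse_gamma_mean(n, x * n)
```

The two numbers `mc_mean` and `exact_mean` should agree up to Monte Carlo error.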
A direct - but more lengthy - calculation of the Bayesian posterior corresponding to
the right Haar prior $d\beta/\beta$ gives a posterior that coincides with the fiducial distribution
found here. It is well known more generally that the Bayesian posterior from a right Haar 
prior in a group model coincides with the fiducial. The calculation demonstrates then 
that a fiducial model and the solution of the fiducial equation give an alternative and in
many cases simpler route for the calculation of the Bayesian posterior. The multivariate 
normal gives another example where the fiducial calculation is done in a few lines, but 
the corresponding Bayesian calculation is much more cumbersome. 
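
For the exponential example the Bayesian calculation can be sketched as follows, if one starts from the sufficient statistic rather than the full sample (a shortcut assumed here for brevity). The sample mean $X$ is gamma distributed with shape $n$ and scale $\beta/n$, so that

```latex
\begin{align*}
f(x \mid \beta) &= \frac{(n/\beta)^n}{\Gamma(n)}\, x^{n-1} e^{-n x/\beta}, \\
\pi(\beta \mid x) &\propto f(x \mid \beta)\,\frac{1}{\beta}
  \;\propto\; \beta^{-(n+1)} e^{-n x/\beta}, \qquad \beta > 0 .
\end{align*}
```

The last expression is the kernel of the inverse-gamma density with shape $n$ and scale $xn$, in agreement with the fiducial distribution.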

An added advantage of the fiducial calculation is that it shows directly that the 
corresponding fiducial distribution is a confidence distribution. This is not easily obtained 
from the Bayesian calculation. The confidence distribution can alternatively be found 
by the likelihood ratio test, and this has the advantage of giving proof of optimality and 
corresponding optimal choices of confidence interval endpoints. An alternative approach 
is to also derive optimal intervals based on Theorem 1 as exemplified by Berger (1985). 
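
The confidence property can be illustrated by simulation for the exponential example. The sketch below estimates the coverage of the upper fiducial limit $x/u$, where $u$ is the $(1 - \text{level})$-quantile of $U$; the quantile is approximated empirically, and all names and default values are illustrative assumptions.

```python
import random


def coverage(beta, n, level, reps=20_000, seed=2):
    """Estimated frequency with which beta lies below the fiducial level-quantile."""
    rng = random.Random(seed)
    # Empirical (1 - level)-quantile of U ~ Gamma(shape=n, scale=1/n).
    u = sorted(rng.gammavariate(n, 1.0 / n) for _ in range(200_000))
    u_low = u[int((1 - level) * len(u))]
    hits = 0
    for _ in range(reps):
        x = beta * rng.gammavariate(n, 1.0 / n)  # simulated sample mean
        upper = x / u_low                         # level-quantile of Theta^x = x / U
        hits += beta <= upper
    return hits / reps
```

For any $\beta$ and $n$ the returned frequency should be close to the nominal level, for instance `coverage(3.0, 5, 0.9)` close to 0.9.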

An alternative fiducial calculation can be done without the reduction to the complete 
sufficient statistic. A maximal invariant $\phi$ is given by $\phi(y) = y/\|y\|$. The conditional
law $(V \mid \Theta = \theta, \phi(V) = \phi(y))$ will be concentrated on the ray $Gy = \{a\phi(y) \mid a > 0\}$ with
a distribution from a density for $a$ proportional to $f_V(a\phi(y))a^{n-1}$. The assumption of
a random sample from the standard exponential gives a particularly simple $f_V$, and the
fiducial is found explicitly as before. The alternative calculation has the advantage that 
it can be used in the more general case where reduction by sufficiency is not available. 
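
The steps of this alternative calculation can be written out as follows. The symbols $s$, $A$, and $G$ are introduced here for illustration: $s$ is the sum of the components of $\phi(y)$, $A$ is the random position on the ray, and $G = sA$.

```latex
% Assumption: f_V(v) = \exp(-\sum_i v_i) on (0,\infty)^n and s = \sum_i y_i / \|y\|.
\begin{align*}
f_V(a\phi(y))\,a^{n-1} &= a^{n-1} e^{-a s}, \qquad a > 0, \\
A &\sim \mathrm{Gamma}(\text{shape } n,\ \text{rate } s), \\
\Theta^y &= \frac{\|y\|}{A} = \frac{s\,\|y\|}{sA} = \frac{\sum_i y_i}{G} = \frac{n x}{G},
  \qquad G \sim \mathrm{Gamma}(\text{shape } n,\ \text{scale } 1).
\end{align*}
```

The first equality in the last line solves the fiducial equation $y = \theta v$ with $v = A\phi(y)$. The result is again the inverse-gamma law with scale $xn$ and shape $n$.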

Consider now the problem of estimation of $\theta = \beta$ with a loss given by $\gamma(\theta, a) =
|\ln\theta - \ln a|^2$. This loss is a natural generalization of the squared error loss, but with
the ordinary distance replaced by the distance $|\ln\theta - \ln a|$ which is the distance given
by the Fisher information metric in the case of the given scale model. In this case
$G = \Omega_X = \Omega_\Theta = \Omega_U = (0, \infty)$ with multiplication as the group operation. The loss is
equivariant, and it follows that the optimal rule $\delta$ based on the sufficient statistic $X$ is
given as the minimizer of $\rho = \mathrm{E}^x |\ln\Theta^x - \ln a|^2$. This gives that the optimal rule is
determined from $\ln\delta(x) = \mathrm{E}^x \ln\Theta^x$. Evaluation of the corresponding integral gives an
explicit formula for the optimal rule. It is

(9) $\delta(x) = x \exp(\ln n - \psi(n))$

where $\psi$ is the digamma function. The estimator given by equation (9) is possibly known
in some contexts, but we have not found this explicit expression in any of the textbooks 
in the list of references or elsewhere. 
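
The estimator in equation (9) is easy to evaluate numerically. The sketch below uses only the standard library; the digamma function is approximated by the usual recurrence followed by an asymptotic series, which is an implementation choice made here and not part of the derivation. The function names are hypothetical.

```python
import math


def digamma(x):
    """Digamma function for x > 0 via recurrence and an asymptotic series."""
    result = 0.0
    while x < 6.0:                 # psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result


def optimal_scale_estimate(xbar, n):
    """Equation (9): xbar * exp(ln n - psi(n)) for a sample mean xbar of size n."""
    return xbar * math.exp(math.log(n) - digamma(n))
```

The correction factor $\exp(\ln n - \psi(n))$ exceeds 1 for every $n$ and approaches 1 as $n$ grows, so the estimator is slightly larger than the sample mean in small samples.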

3.6. Gamma distribution. Assume that $Y_1, \ldots, Y_n$ is a random sample of size $n$ from
the gamma distribution with scale parameter $\beta$ and shape parameter $\alpha$. The model
parameter is $\theta = (\alpha, \beta)$. This gives an example as in subsection 3.1 where the results
related to Theorem 1 cannot be applied directly. Fiducial theory can be used to obtain
candidates for good frequentist inference procedures as indicated next. Particular results
include an exact joint confidence distribution for $(\alpha, \beta)$, an exact confidence distribution
for $\alpha$, and a recipe which produces estimators for functions of $(\alpha, \beta)$ that depend on
the choice of a loss.
A fiducial model is given by $Y_i = \beta F^{-1}(U_i; \alpha)$, where the law $P_U^\theta$ corresponds to a
random sample of size $n$ from the uniform distribution on the unit interval $(0,1)$, and
$F^{-1}(u; \alpha)$ is the inverse cumulative distribution function of a gamma variable with shape
$\alpha$ and scale 1.

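Let $X = (\bar{Y}, \tilde{Y}/\bar{Y})$ where $\bar{Y}$ and $\tilde{Y}$ are the arithmetic and geometric means. The
distribution of the Bartlett statistic $W = \tilde{Y}/\bar{Y}$ depends only on $\alpha$, and $W$ is independent
of $\bar{Y}$ as a consequence of the Basu theorem. A corresponding fiducial model $(\chi, U)$ for $P_X^\theta$
is given by $\chi_1(u, \theta) = \beta \overline{F^{-1}}(u; \alpha)$ and
$\chi_2(u, \theta) = \widetilde{F^{-1}}(u; \alpha)/\overline{F^{-1}}(u; \alpha)$, where the bar and the tilde denote
the arithmetic and geometric means over the components. It can be noted that $(\chi_2, U)$ gives
separately a fiducial model for the law of $X_2 = W$. The corresponding fiducial distribution for
$\alpha$ is hence a confidence distribution.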
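
The fact that the Bartlett statistic carries no information about the scale can be seen directly: multiplying all observations by $\beta$ leaves the ratio of the geometric to the arithmetic mean unchanged. A minimal sketch follows, with illustrative values of $\alpha$ and $\beta$ and a hypothetical function name.

```python
import math
import random


def bartlett_statistic(y):
    """Ratio of the geometric mean to the arithmetic mean of positive data."""
    n = len(y)
    geometric = math.exp(sum(math.log(v) for v in y) / n)
    arithmetic = sum(y) / n
    return geometric / arithmetic


rng = random.Random(3)
alpha, beta = 2.5, 7.0                                        # illustrative values
standard = [rng.gammavariate(alpha, 1.0) for _ in range(8)]   # shape alpha, scale 1
scaled = [beta * v for v in standard]                         # shape alpha, scale beta
w_standard = bartlett_statistic(standard)
w_scaled = bartlett_statistic(scaled)                         # equal up to rounding
```

The statistic always lies in the interval $(0, 1]$ by the inequality between the geometric and arithmetic means.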

An alternative fiducial model $(\eta_2, V_2)$ for the law of $X_2$ is given by inversion of the
cumulative distribution function for $W$. An alternative to $\overline{F^{-1}}(u; \alpha)$ is given by inversion for
a gamma variable with shape $n\alpha$ and scale $1/n$. The combination gives an alternative
fiducial model $(\eta, V)$ for $P_X^\theta$ with the property that $x = \eta(v, \theta)$ defines a one-one
correspondence between any two variables when the third is kept fixed. The law $P_V^\theta$ is
the uniform law on the unit square $(0,1)^2$. Coordinate transformations can be used to
identify $G = \Omega_\Theta = \Omega_V = \Omega_X$ as sets with a quasigroup structure with a unit.

Both fiducial models $(\chi, U)$ and $(\eta, V)$ are simple and conventional and determine
a fiducial distribution. For concreteness let $\Theta^x$ be the fiducial corresponding to $(\eta, V)$.
The quasigroup structure ensures that $\Theta^x$ gives a joint exact confidence distribution for
$(\alpha, \beta)$.

Consider the problem of estimation of a function $\tau = h(\alpha, \beta)$ of the model parameter
$\theta = (\alpha, \beta)$. It can be allowed that $h$ is vector valued, but assume that each component
is positive. Three examples that are included are then given by $\tau = \alpha$, $\tau = \beta$, and
$\tau = (\alpha, \beta)$. A possible loss in these three cases is given by squared error loss on a
logarithmic scale. A candidate estimator $\delta$ is then given naturally by

(10) $\delta(x) = \exp(\mathrm{E}^x \ln h(\Theta^x))$

This can be evaluated by Monte-Carlo simulation from $P_V^\theta$ which is the uniform
distribution on the unit square $[0, 1]^2$. Another possibility is given by squared distance loss
defined by the Fisher information metric on $\Omega_\Theta$ in the case $h(\theta) = (\alpha, \beta)$.
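
Under squared error loss on a logarithmic scale the minimizer of the fiducial expected loss is the exponential of the fiducial mean of the logarithm, by the same argument as in the exponential example. The Monte Carlo step can be sketched as below. The input is assumed to be a list of fiducial draws of $\tau$, each a tuple of positive numbers; how the draws are produced by solving the fiducial equation is not shown, and the function name is hypothetical.

```python
import math


def log_scale_estimate(draws):
    """Componentwise exp(mean(log(.))) over fiducial draws of tau.

    `draws` is a non-empty list of equal-length tuples with positive entries.
    """
    m = len(draws)
    dim = len(draws[0])
    return tuple(
        math.exp(sum(math.log(d[k]) for d in draws) / m) for k in range(dim)
    )
```

The result is the componentwise geometric mean of the draws, so two draws $(1, 4)$ and $(4, 1)$ give the estimate $(2, 2)$.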

4. Discussion. The foundations of Bayesian and frequentist modeling and inference 
are well established both in terms of mathematical theory and interpretation. We do not 
think that the same can be said about fiducial theory, but some readers may object to 
this. A brief discussion of alternative formulations and naming conventions found in the 
literature seems hence to be in place. 

Definition 1 identifies a fiducial model with a pair $(U, \zeta)$. The fiducial model is by
definition conventional if $P_U^\theta$ does not depend on $\theta$. In this case we suggest to denote $U$
as the Monte Carlo variable and the measurable function $\zeta$ as the fiducial relation. The
corresponding equation $z = \zeta(u, \theta)$ is the fiducial equation, but it may also equivalently
be denoted as the fiducial relation. A fiducial model $(U, \zeta)$ is hence defined by a Monte
Carlo variable $U$ and a fiducial relation $\zeta$.



If $u$ is a sample from the Monte Carlo distribution $P_U^\theta$, then $z = \zeta(u, \theta)$ is a sample
from the statistical model as in Definition 1. The inversion method gives the prototypical
example with $\zeta(u, \theta) = F^{-1}(u \mid \theta)$ and $P_U^\theta$ equal to the uniform law on the interval $[0, 1]$.
This gives the link to the original definition of Fisher, and also a justification of the 
choice of the term Monte Carlo variable since this represents a standard method for 
simulation from a statistical model on a computer. 
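
The inversion method and the solution of the fiducial equation can be sketched for the exponential scale model, where the quantile function is $F^{-1}(u \mid \theta) = -\theta \ln(1 - u)$. The function names are hypothetical, and the case $u = 0$, which has probability zero, is not handled.

```python
import math
import random


def zeta(u, theta):
    """Fiducial relation by inversion: exponential quantile function with scale theta."""
    return -theta * math.log(1.0 - u)


def solve_fiducial_equation(z, u):
    """Solve z = zeta(u, theta) for theta, assuming 0 < u < 1."""
    return z / (-math.log(1.0 - u))


def simulate_model(theta, size, seed=5):
    """Sample from the statistical model by feeding uniform draws into zeta."""
    rng = random.Random(seed)
    return [zeta(rng.random(), theta) for _ in range(size)]
```

Drawing $u$ from the uniform law and evaluating `zeta` simulates data for a given parameter, whereas drawing $u$ and evaluating `solve_fiducial_equation` for a fixed observation $z$ simulates from the fiducial distribution.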

The ingredients above given by the pair $(U, \zeta)$ are also the starting point for Dempster-Shafer
theory (Dempster, 1968; Shafer, 1982). Martin et al. (2010) refer to $U$ as the
auxiliary variable and the probability measure $\mu$ as the pivotal measure, where $U \sim \mu$.
The equation $Z = \zeta(U, \Theta)$ is denoted the a-equation. The whole set-up is referred to as
an inferential model, and this is identified as something which is not equivalent to a sta- 
tistical model. Except for differences in naming conventions it can be concluded that an 
inferential model is essentially the same as a conventional fiducial model as summarized 
in the previous paragraphs. The Dempster-Shafer calculus gives an alternative route for 
inference based on a fiducial model, but this is not discussed further here. 

The discussion of fiducial theory we have presented is close to the presentation given 
by Dawid and Stone (1982). Dawid and Stone (1982, p. 1055) use the term fiducial model
for the combination of $\Theta^z = \theta^z(U)$ and $U \sim P_U^\theta$, and use the term functional model to
describe the more general relation $Z = \zeta(U, \Theta)$. We chose to avoid the term functional
model since the term functional data analysis is now the name of a branch of statistics.
Dawid and Stone (1982) denote the variable $U$ as the error variable, and use the symbol
$E$ instead. This corresponds to the naming convention used by Fraser (1968). Fraser
(1968, p. 50) uses the terms structural model and structural equation in the case where
group theory is an essential ingredient. Hannig (2009) uses the term structural equation
instead of the term fiducial equation as used by us. We have avoided the term structural
here since there is an active and well developed different theory which goes under the 
label of structural equations modeling. Our preference for the term fiducial as used here is 
mainly based on economy of language, and since this gives the direct link to the original 
papers of Fisher. 

The mathematically inclined reader may claim that Definition 1 is not precise. This, 
and the fact that this definition is a novelty compared with previous writers, motivate 
us to state in more detail the assumptions that are taken as implicitly given from the 
context in the statement of Definition 1. The main difference is that every concept is 
based on an underlying abstract conditional probability space $(\Omega, \mathcal{E}, P)$ as stated initially
in Section 2. The fiducial relation is a measurable function $\zeta : \Omega_U \times \Omega_\Theta \to \Omega_Z$. This
means, as usual, that $(\zeta \in A) = \{(u, \theta) \mid \zeta(u, \theta) \in A\}$ is a measurable set in the product
$\sigma$-algebra of $\Omega_U \times \Omega_\Theta$ whenever $A$ is a measurable set in $\Omega_Z$. A consequence is that
$\zeta(U, \Theta)$ is a random element in $\Omega_Z$ defined by the mapping $\omega \mapsto \zeta(U(\omega), \Theta(\omega))$. This is
measurable since it is assumed that $\Theta : \Omega \to \Omega_\Theta$ and $U : \Omega \to \Omega_U$ are measurable. The
conditional law $P_U^\theta$ of the Monte Carlo variable $U$ is known and does not depend on $\theta$
in the case of a conventional fiducial model. If the considerations were limited to the
case where $(\Omega, \mathcal{E}, P)$ is a probability space, then this would imply $P_U^\theta = P_U$. This fails
generally as explained in more detail by Taraldsen and Lindqvist (2010) since $P_U^\theta$ is a
probability measure, but $P_U$ is unbounded if $P$ is unbounded. The reason for allowing
unbounded measures is the need to include improper priors $P_\Theta$. This has proved fruitful
in related ongoing research by the authors. It gives in particular natural conditions that 
imply equality of Bayesian posteriors and fiducial distributions. In specific modeling 
cases the spaces $\Omega_U$, $\Omega_Z$, $\Omega_\Theta$, the fiducial relation $\zeta$, and the conditional law $P_U^\theta$ are
all explicitly given. This is as demonstrated by the examples in Section 3. The other
ingredients mentioned above are not given explicitly since they rely on the underlying
unspecified space $\Omega$. This is as in the ordinary formulation of probability theory by
Kolmogorov (1933) where the whole theory is built upon the underlying abstract space
$\Omega$. Existence must be proved in each specific modeling case, but follows trivially in many
cases from the construction of a suitable product space. 

Optimal inference for the scale, the location, and the location-scale problems was
investigated using fiducial theory by Pitman (1939). His presentation is most readable 
and is a good alternative to the presentations found in standard textbooks. It can, 
however, be noted that he concludes that the confidence and the fiducial theories are 
essentially the same. This is in contrast to the views of Neyman and Fisher. They 
seemed to agree that in principle the fiducial distribution as described by Fisher is not 
connected to the concept of confidence intervals as described by Neyman and co-workers. 
The content and aims of these two theories are different. It seems clear that Fisher never 
intended to get confidence intervals as the result of his fiducial arguments. 

It is true that the fiducial distributions found in the location-scale problems, and more 
general group problem as in Theorem 1, are confidence distributions, but we do consider 
the concepts to be essentially different in general. The interpretation of the fiducial 
distribution, according to Fisher (1973, p. 54, p. 59) is identical with the interpretation 
of the Bayesian posterior: It represents the state of knowledge regarding the model 
parameter as a result of the model assumptions and the observation in the experiment. 
It follows then in particular that the fiducial distribution of a function $\phi(\theta)$ of the model
parameter $\theta$ equals the distribution of $\phi(\Theta^x)$ where $\Theta^x$ has the fiducial distribution. This
property does not hold for confidence distributions in general. In addition, the fiducial 
distribution for a simple fiducial model as in Definition 2 is not a confidence distribution 
in general (Dawid and Stone, 1982). 

The possibly most famous fiducial distribution is the fiducial distribution of the difference
of means $\mu_1 - \mu_2$ corresponding to two independent samples from two different
normal distributions. This fiducial distribution gives Fisher's solution to the Behrens- 
Fisher problem, but it can be shown by simulation that it is not a confidence distribution 
in the sense of having exact coverage probabilities. A more general class of confidence 
distributions is defined by requiring not exact but conservative coverage probabilities. 
This is in conformity with the definition of confidence sets in general. Exactness is often 
misguidedly taken as a measure of goodness, but it is not. Power of the associated tests 
gives one natural measure of goodness. Examples demonstrate that this more general 
concept of a confidence distribution does not coincide with fiducial distributions in gen- 
eral, but it seems to be an open question whether the Behrens-Fisher fiducial distribution 
is a confidence distribution in this more extended sense. Numerical simulations indicate 
that the claim holds (Robinson, 1976; Barnard, 1984, p. 269).
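
A simulation of the kind referred to above can be sketched as follows. The sketch uses the standard representation of the Behrens-Fisher fiducial law, in which each mean equals the sample mean minus the standard error times an independent Student $t$ variable; the central interval is read off from Monte Carlo quantiles. The function names, the sample sizes, and the number of replications are illustrative assumptions, and the estimated coverage carries Monte Carlo error.

```python
import math
import random
import statistics


def student_t(rng, df):
    """Draw from Student's t with df degrees of freedom."""
    chi2 = rng.gammavariate(df / 2.0, 2.0)   # chi-square with df degrees of freedom
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)


def fiducial_interval(x1, x2, level, rng, draws=1000):
    """Central interval from Monte Carlo draws of the fiducial law of mu1 - mu2."""
    n1, n2 = len(x1), len(x2)
    centre = statistics.fmean(x1) - statistics.fmean(x2)
    e1 = statistics.stdev(x1) / math.sqrt(n1)
    e2 = statistics.stdev(x2) / math.sqrt(n2)
    d = sorted(
        centre - e1 * student_t(rng, n1 - 1) + e2 * student_t(rng, n2 - 1)
        for _ in range(draws)
    )
    tail = (1.0 - level) / 2.0
    return d[int(tail * draws)], d[int((1.0 - tail) * draws) - 1]


def coverage(mu1, sd1, n1, mu2, sd2, n2, level=0.95, reps=300, seed=4):
    """Fraction of simulated data sets whose fiducial interval covers mu1 - mu2."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x1 = [rng.gauss(mu1, sd1) for _ in range(n1)]
        x2 = [rng.gauss(mu2, sd2) for _ in range(n2)]
        low, high = fiducial_interval(x1, x2, level, rng)
        hits += low <= mu1 - mu2 <= high
    return hits / reps
```

A call such as `coverage(0.0, 1.0, 5, 0.0, 3.0, 8)` estimates the coverage for one configuration of variances and sample sizes; many configurations and more replications would be needed before drawing any conclusion about conservativeness.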

The more general problem of obtaining a confidence interval for the linear combi- 
nation of several means from different normal distributions is of considerable practical 
importance (ISO/IEC, 2008). The ISO recommended solution is in terms of a Welch- 
Satterthwaite solution, but a continuation of the arguments given by Barnard (1984) 
leads to the conclusion that the fiducial solution is a most competitive alternative solu- 
tion. 

The main virtue of the location-scale models in the context here is that they illustrate 
very well the reduction given by a maximal invariant in cases where a reduction by 
sufficiency is not possible. This is also true for the multivariate models treated by Fraser 
(1979). In this case the multivariate normal can be reduced by sufficiency, but more 
general models can again be treated by a reduction through maximal invariants. It 
seems that optimal, or good, inference procedures in these multivariate cases deserve
further study guided by fiducial theory. A recent example of this is given by Lidong et al. 
(2008), but there are a multitude of different possible examples as indicated by Fraser 
(1979). The suggestion given by Theorem 1 is that not only confidence intervals, but 
also other kinds of inference such as estimation should be considered. 

Eaton (1989, pp. 89-91) considers the estimation of the covariance matrix from a multivariate
normal sample. He gives two possible candidates to use as a loss $\gamma(\theta, a)$. This
exemplifies that in the multivariate cases, and in more complicated group cases, it can be 
difficult to decide upon which equivariant loss to use. It can even be difficult to come up 
with a good candidate. In our examples it has been indicated that the squared distance 
from the Fisher information metric is a natural choice. This will be invariant under mild 
conditions. For a statistical model $f(x \mid \theta)\,\mu(dx)$ the distance is defined via the length
of paths $t \mapsto x(t) = \sqrt{f(\cdot \mid \theta(t))}$ in the Hilbert space $L_2(\mu)$. The nonparametric case
given by a parameter space equal to all densities with respect to $\mu$ gives the distance
$d(f, g) = \cos^{-1}(\int \sqrt{fg}\,d\mu)$. The other end of the scale is given by a smooth finite
dimensional parametric model. In this case the previous leads to the Fisher information
metric: $ds^2 = (1/4)\,g_{ij}\,d\theta^i d\theta^j$ where $g_{ij} = \mathrm{E}^\theta[(\partial_i \ln f(X \mid \theta))(\partial_j \ln f(X \mid \theta))]$. In either
case it gives the model parameter space as a manifold equipped with a distance derived 
intrinsically from the statistical model. 
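
The nonparametric distance can be illustrated in the simplest case where $\mu$ is counting measure on a finite set, so that densities are probability vectors and the integral is a sum. This restriction and the function name are assumptions made for the sketch.

```python
import math


def hilbert_angle(f, g):
    """Distance acos(sum_i sqrt(f_i * g_i)) between two probability vectors."""
    affinity = sum(math.sqrt(a * b) for a, b in zip(f, g))
    return math.acos(min(1.0, affinity))  # clamp guards against rounding above 1
```

The distance is zero for identical distributions and attains its maximum $\pi/2$ for distributions with disjoint supports.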

The focus of fiducial theory has initially and currently most often been on the fiducial 
distribution by itself and the related possibility of construction of approximate or exact 
confidence intervals. The relation to other kinds of optimal inference such as estimation or 
prediction was considered by Mora and Buehler (1966, 1967). The proofs they presented 
rely on the existence of an invariant measure, and it was clear that the fiducial in the case 
they considered corresponded to a Bayesian posterior from the right Haar prior. Since 
then it has been established in a variety of problems that the Bayesian algorithm can be 
used quite generally to obtain good or optimal frequentist procedures. The calculation 
given in equation (1) can be taken as a strong indication that the fiducial algorithm 
can be used similarly to not only obtain confidence intervals, but also possibly good 
or optimal frequentist procedures more generally. This statement is too general to be 
provable, but we consider nonetheless this to be the main content and message in this 
paper. The point of view in this paper does not rely on any particular interpretation of
the fiducial. It is here simply viewed as a very convenient vehicle for the derivation of 
good, and sometimes optimal as in Theorem 1, frequentist inference procedures. 